AITopics | multi-task training

Collaborating Authors

multi-task training

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Open-Book Neural Algorithmic Reasoning

Neural Information Processing SystemsMar-18-2026, 06:45:45 GMT

Neural algorithmic reasoning is an emerging area of machine learning that focuses on building neural networks capable of solving complex algorithmic tasks. Recent advancements predominantly follow the standard supervised learning paradigm -- feeding an individual problem instance into the network each time and training it to approximate the execution steps of a classical algorithm. We challenge this mode and propose a novel open-book learning framework. In this framework, whether during training or testing, the network can access and utilize all instances in the training dataset when reasoning for a given instance.Empirical evaluation is conducted on the challenging CLRS Algorithmic Reasoning Benchmark, which consists of 30 diverse algorithmic tasks. Our open-book learning framework exhibits a significant enhancement in neural reasoning capabilities. Further, we notice that there is recent literature suggesting that multi-task training on CLRS can improve the reasoning accuracy of certain tasks, implying intrinsic connections between different algorithmic tasks.

artificial intelligence, machine learning, proceedings, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.56)

Add feedback

Multi-Head Adapter Routing for Cross-Task Generalization

Neural Information Processing SystemsDec-26-2025, 14:11:25 GMT

Parameter-efficient fine-tuning (PEFT) for cross-task generalization consists in pre-training adapters on a multi-task training set before few-shot adaptation to test tasks. Polytropon [Ponti et al., 2023] ($\texttt{Poly}$) jointly learns an inventory of adapters and a *routing* function that selects a (variable-size) subset of adapters for each task during both pre-training and few-shot adaptation. In this paper, we investigate the role that adapter routing plays in its success and design new variants based on our findings.First, we build on the intuition that finer-grained routing provides more expressivity. Hence,we propose $\texttt{MHR}$ (Multi-Head Routing) which combines *subsets* of adapter parameters and outperforms $\texttt{Poly}$ under a comparable parameter budget; by only fine-tuning the routing function and not the adapters ($\texttt{MHR}$-$z$) we achieve competitive performance with extreme parameter efficiency. Second, we find that $\texttt{Poly}$/$\texttt{MHR}$ performance is a result of better multi-task optimization, rather than modular inductive biases that facilitate adapter recombination and local adaptation, as previously hypothesized.

adapter, multi-head adapter routing, texttt, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

12ffe4499085e9a51beb02441212e26b-Paper-Conference.pdf

Neural Information Processing SystemsOct-9-2025, 18:58:44 GMT

dataset, experiment, multi-task training, (16 more...)

Neural Information Processing Systems

Country:

Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Anhui Province > Hefei (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)
Workflow (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(3 more...)

Add feedback

ThanoRA: Task Heterogeneity-Aware Multi-Task Low-Rank Adaptation

Liang, Jian, Huang, Wenke, Guo, Xianda, Wan, Guancheng, Du, Bo, Ye, Mang

arXiv.org Artificial IntelligenceSep-30-2025

Low-Rank Adaptation (LoRA) is widely adopted for downstream fine-tuning of foundation models due to its efficiency and zero additional inference cost. Many real-world applications require foundation models to specialize in several specific tasks simultaneously, motivating the need for efficient multi-task downstream adaptation. To address this need, existing studies have primarily explored two directions: Model Merging with LoRA, which shows advantages in training-free scenarios but still lags behind multi-task training in overall performance; and MoE-based LoRA approaches, which improve multi-task learning performance but introduce routers that hinder the mergeability of LoRA parameters and incur considerable inference overhead, thereby limiting real-world deployment practicality. To this end, we propose ThanoRA, a Task Heterogeneity-Aware Multi-Task Low-Rank Adaptation framework that enables effective, efficient and unified multi-task downstream adaptation without introducing additional structure. ThanoRA performs multi-task learning by tailoring subspace allocation at initialization and enforcing diversity preservation throughout training: it allocates varying dimensions to construct task-specific low-rank subspaces driven by inter-task heterogeneity, enabling fine-grained knowledge injection, while diversity-preserving regularization mitigates task interference and subspace collapse, thereby fully exploiting the low-rank capacity. Extensive experiments across multimodal and text-only benchmarks under varying multi-task mixtures demonstrate that ThanoRA consistently outperforms strong baselines, surpassing even separate task-specific fine-tuning, while introducing no additional structures or inference overhead. Our code will be publicly available at: https://github.com/LiangJian24/ThanoRA.

artificial intelligence, arxiv preprint arxiv, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2505.1864

Genre: Research Report > New Finding (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

a10c3d85879c29331ba4d73ede56c7d3-Paper-Conference.pdf

Neural Information Processing SystemsSep-28-2025, 22:38:39 GMT

machine learning, reinforcement learning, variation, (15 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.28)

Genre: Research Report > Experimental Study (0.93)

Industry:

Education (1.00)
Transportation > Ground > Road (0.69)
Transportation > Infrastructure & Services (0.48)
Energy > Oil & Gas (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Robots (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
(3 more...)

Add feedback

Model-Based Transfer Learning for Contextual Reinforcement Learning

Neural Information Processing SystemsAug-21-2025, 05:49:18 GMT

Deep reinforcement learning (RL) is a powerful approach to complex decision making. However, one issue that limits its practical application is its brittleness, sometimes failing to train in the presence of small changes in the environment.

machine learning, reinforcement learning, variation, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.28)
Europe > Netherlands (0.14)
Asia > Taiwan (0.14)
Asia > Middle East > Republic of Türkiye (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry:

Education (1.00)
Transportation > Ground > Road (0.69)
Transportation > Infrastructure & Services (0.48)
Energy > Oil & Gas (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Open-Book Neural Algorithmic Reasoning

Neural Information Processing SystemsMay-26-2025, 16:42:06 GMT

algorithmic reasoning, artificial intelligence, machine learning, (4 more...)

Neural Information Processing Systems

Country: Asia > China > Anhui Province > Hefei (0.08)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.60)

Add feedback

Multi-Head Adapter Routing for Cross-Task Generalization

Neural Information Processing SystemsJan-19-2025, 19:33:18 GMT

Parameter-efficient fine-tuning (PEFT) for cross-task generalization consists in pre-training adapters on a multi-task training set before few-shot adaptation to test tasks. Polytropon [Ponti et al., 2023] ( \texttt{Poly}) jointly learns an inventory of adapters and a *routing* function that selects a (variable-size) subset of adapters for each task during both pre-training and few-shot adaptation. In this paper, we investigate the role that adapter routing plays in its success and design new variants based on our findings.First, we build on the intuition that finer-grained routing provides more expressivity. Hence,we propose \texttt{MHR} (Multi-Head Routing) which combines *subsets* of adapter parameters and outperforms \texttt{Poly} under a comparable parameter budget; by only fine-tuning the routing function and not the adapters ( \texttt{MHR} - z) we achieve competitive performance with extreme parameter efficiency. Second, we find that \texttt{Poly} / \texttt{MHR} performance is a result of better multi-task optimization, rather than modular inductive biases that facilitate adapter recombination and local adaptation, as previously hypothesized.

adapter, multi-head adapter routing, texttt, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Open-Book Neural Algorithmic Reasoning

Li, Hefei, Peng, Chao, Xu, Chenyang, Yang, Zhengfeng

arXiv.org Artificial IntelligenceDec-29-2024

Neural algorithmic reasoning is an emerging area of machine learning that focuses on building neural networks capable of solving complex algorithmic tasks. Recent advancements predominantly follow the standard supervised learning paradigm -- feeding an individual problem instance into the network each time and training it to approximate the execution steps of a classical algorithm. We challenge this mode and propose a novel open-book learning framework. In this framework, whether during training or testing, the network can access and utilize all instances in the training dataset when reasoning for a given instance. Empirical evaluation is conducted on the challenging CLRS Algorithmic Reasoning Benchmark, which consists of 30 diverse algorithmic tasks. Our open-book learning framework exhibits a significant enhancement in neural reasoning capabilities. Further, we notice that there is recent literature suggesting that multi-task training on CLRS can improve the reasoning accuracy of certain tasks, implying intrinsic connections between different algorithmic tasks. We delve into this direction via the open-book framework. When the network reasons for a specific task, we enable it to aggregate information from training instances of other tasks in an attention-based manner. We show that this open-book attention mechanism offers insights into the inherent relationships among various tasks in the benchmark and provides a robust tool for interpretable multi-task training.

artificial intelligence, inductive learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2501.00072

Country:

Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Anhui Province > Hefei (0.04)

Genre:

Research Report (1.00)
Workflow (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Add feedback

MARE: Multi-Aspect Rationale Extractor on Unsupervised Rationale Extraction

Jiang, Han, Duan, Junwen, Qu, Zhe, Wang, Jianxin

arXiv.org Artificial IntelligenceOct-4-2024

Unsupervised rationale extraction aims to extract text snippets to support model predictions without explicit rationale annotation. Researchers have made many efforts to solve this task. Previous works often encode each aspect independently, which may limit their ability to capture meaningful internal correlations between aspects. While there has been significant work on mitigating spurious correlations, our approach focuses on leveraging the beneficial internal correlations to improve multi-aspect rationale extraction. In this paper, we propose a Multi-Aspect Rationale Extractor (MARE) to explain and predict multiple aspects simultaneously. Concretely, we propose a Multi-Aspect Multi-Head Attention (MAMHA) mechanism based on hard deletion to encode multiple text chunks simultaneously. Furthermore, multiple special tokens are prepended in front of the text with each corresponding to one certain aspect. Finally, multi-task training is deployed to reduce the training overhead. Experimental results on two unsupervised rationale extraction benchmarks show that MARE achieves state-of-the-art performance. Ablation studies further demonstrate the effectiveness of our method. Our codes have been available at https://github.com/CSU-NLP-Group/MARE.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2410.03531

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Texas (0.04)
North America > United States > New York > New York County > New York City (0.04)
(6 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback